Search Algorithms

Part 3 - Breadth First Algorithm

We solved path finding problem for a tree) in our last section. Now we will try for a graph which would have closed loops. In reality, if you think of any map, there are always routes naturally forming closed loops. Below is our problem, now adapted further to be a graph (in fact, structurally this is similar to original AIMA map). Take note of the new closed loops.

In [1]:
from docHelpers_ipython import Doc, romania_location_map_full
from IPython.display import display, Image

doc = Doc(romania_location_map_full)
mappy = doc.showMap()
display(Image(mappy))

The problem statement: find a route from Arad to Bucharest

As usual, we will use the short hand notation, for easier handling and understanding..

In [2]:
from docHelpers_ipython import romania_location_map

doc = Doc(romania_location_map)
mappy = doc.showMap()
display(Image(mappy))

Arbitrary graphs with cycles

Check out our map description..

In [3]:
romania_location_map
Out[3]:
{'A': {'pos': (21.31227, 46.18656), 'connections': ['S', 'T', 'Z']},
 'S': {'pos': (24.12558, 45.79833), 'connections': ['F', 'RV', 'O']},
 'Z': {'pos': (21.51742, 46.62251), 'connections': ['O']},
 'T': {'pos': (21.20868, 45.74887), 'connections': ['LU']},
 'O': {'pos': (21.91894, 47.0465), 'connections': []},
 'F': {'pos': (24.9731, 45.84164), 'connections': ['B']},
 'LU': {'pos': (21.90346, 45.69099), 'connections': ['M']},
 'RV': {'pos': (24.36932, 45.09968), 'connections': ['C', 'P']},
 'M': {'pos': (22.36452, 44.90411), 'connections': ['D']},
 'D': {'pos': (22.65973, 44.63692), 'connections': ['C']},
 'C': {'pos': (23.79488, 44.33018), 'connections': []},
 'P': {'pos': (24.86918, 44.85648), 'connections': ['B', 'C']},
 'B': {'pos': (26.10254, 44.42677), 'connections': ['G', 'U']},
 'G': {'pos': (25.96993, 43.90371), 'connections': []},
 'U': {'pos': (26.64112, 44.71653), 'connections': ['H', 'V']},
 'V': {'pos': (27.72765, 46.64069), 'connections': ['LA']},
 'LA': {'pos': (27.60144, 47.15845), 'connections': ['N']},
 'N': {'pos': (26.38188, 46.97587), 'connections': []},
 'H': {'pos': (27.94566, 44.68935), 'connections': ['E']},
 'E': {'pos': (28.65273, 44.04911), 'connections': []}}

If you look at above description, we have carefully avoided non repeating nodes though forming cycles.
For eg, A's connections are S, T, Z but S's connections do not include A. In reality, we always cannot ensure this. Arbitrary graphs might have cycles, like here, S again referring back to A. What happens to our algo in that case? Let us try a modified map description with A included as S's neighbor. In fact that is not wrong. We are just including all the connected nodes to node 'S'.

In [4]:
romania_location_map_CYCLED = {
    'A' : {'pos': (21.31227,46.18656), 'connections': ['S','T','Z'] },
    'S' : {'pos': (24.12558,45.79833), 'connections': ['A','F','RV','O'] },  # Note 'A' included here
    'Z' : {'pos': (21.51742,46.62251), 'connections': ['O'] },
    'T' : {'pos': (21.20868,45.74887), 'connections': ['LU'] },
    'O' : {'pos': (21.91894,47.04650), 'connections': [] },
    'F' : {'pos': (24.97310,45.84164), 'connections': ['B'] },
    'LU' : {'pos': (21.90346,45.69099), 'connections': ['M'] },
    'RV' : {'pos': (24.36932,45.09968), 'connections': ['C','P'] },
    'M' : {'pos': (22.36452,44.90411), 'connections': ['D'] },
    'D' : {'pos': (22.65973,44.63692), 'connections': ['C'] },
    'C' : {'pos': (23.79488,44.33018), 'connections': [] },
    'P' : {'pos': (24.86918,44.85648), 'connections': ['B','C'] },
    'B' : {'pos': (26.10254,44.42677), 'connections': ['G','U'] },
    'G' : {'pos': (25.96993,43.90371), 'connections': [] },
    'U' : {'pos': (26.64112,44.71653), 'connections': ['H','V'] },
    'V' : {'pos': (27.72765,46.64069), 'connections': ['LA'] },
    'LA' : {'pos':(27.60144,47.15845), 'connections': ['N'] },
    'N' : {'pos': (26.38188,46.97587), 'connections': [] },
    'H' : {'pos': (27.94566,44.68935), 'connections': ['E'] },
    'E' : {'pos': (28.65273,44.04911), 'connections': [] }
}

Let us try our algo now on this new map..(since it would expectedly results in error, I have code blocked and displayed the result screenshot)

Input:

from collections import deque

cameFrom = { }  # This has to be globally accessible for reconstructPath to 

def reconstructPath(current_node):
    print(cameFrom)
    Path = [current_node]
    while current_node is not None:
        current_node = cameFrom[current_node] # Now current node would become 'A'
        Path.append(current_node)
    return reversed(Path[:-1]) # trimming    

def ourSearchAlgo(start, goal): # Wait, we havent named yet?

    # INITIALIZATION
    openSet = deque()
    openSet.append(start)  
    cameFrom[start] = None

    # MAIN LOOP
    while openSet: #eventual failure exit

        current_node = openSet.popleft()    # FIFO 

        if current_node is goal:   # successful exit
            print('Success. Route from {} to {} found. Path: {}'.format(start,goal,list(reconstructPath(current_node))))
            break

        for each_neighbor in M.get(current_node,[]).get('connections',[]):  # add neighbors        
            openSet.append(each_neighbor)                  
            cameFrom[each_neighbor] = current_node      

        print(list(openSet))

    return 'No Goal found!'


M = romania_location_map_CYCLED  # TRYING OUT OUR NEW MAP..
start = 'A'
goal = 'B'
result = ourSearchAlgo(start, goal)

Output:

2018-05-19_16h35_40.png

Note how 'A' re appears in the queue and messes up the execution, finally also leading to memory error. Could have led to infinite loop, luckily we got 'B' before that (note 'A' waiting still in queue)

We could have avoided this mess, if we some how kept track of the visited/processed nodes. Just like cameFrom, we just shall do that - create a list of visited nodes (say, closedSet), and process only if we have not visited earlier (or not in the visited nodes list).

Since visited nodes, should always thus contain unique values (as we will not even process visited nodes again), thanks to python, we could use Python set()

Food for thought: We initialize closedSet also with 'A', that is start, though openSet also initialized with same. Why? (answer in comments below)

In code below, only 2 lines added for this new change indicated by comments starting with # GRAPH

Let us now execute and check..

In [5]:
from collections import deque

cameFrom = { }  # This has to be globally accessible for reconstructPath to 

def reconstructPath(current_node):
    Path = [current_node]
    while current_node is not None:
        current_node = cameFrom[current_node] # Now current node would become 'A'
        Path.append(current_node)
    return reversed(Path[:-1]) # trimming    

def ourSearchAlgo(start, goal): # Wait, we havent named yet?

    # INITIALIZATION
    openSet = deque()
    openSet.append(start)  
    cameFrom[start] = None
    closedSet = set(start)   # GRAPH: we never get to add 'start' in closed set once in main loop, so we do now!             

    # MAIN LOOP
    while openSet: #eventual failure exit

        current_node = openSet.popleft()    # FIFO 

        if current_node is goal:   # successful exit
            print('Success. Route from {} to {} found. Path: {}'.format(start,goal,list(reconstructPath(current_node))))
            break

        all_neighbors = M.get(current_node,[]).get('connections',[])
        for each_neighbor in all_neighbors:  # add neighbors    
            if each_neighbor not in closedSet:  # GRAPH: add to queue only if not already visited
                openSet.append(each_neighbor)                  
                cameFrom[each_neighbor] = current_node    
                closedSet.add(each_neighbor)


    return 'No Goal found!'


M = romania_location_map_CYCLED 
start = 'A'
goal = 'B'
result = ourSearchAlgo(start, goal)
Success. Route from A to B found. Path: ['A', 'S', 'F', 'B']

Voila! We did it again. It worked. We just created a BFS algorithm also for graphs!

NOTE: With completing BFS, we have laid important foundation for building more algorithms. We maintain parent via cameFrom, we maintain visited nodes via closedSet using as set() so as to be unique, we use python's queue setup for openSet, these are all important constructs to be remembered with purpose, as in upcoming sections, these constructs would be already present, and should not be confusing you anyway. Make sure, you understand them like, instinctively! Repeat. Try applying our approach for another search problem.

Visualization (Optional)

We will inject a small code to keep track of bag contents, nodes traversed for sake of visualization. And then render an animation to visualize how nodes were traversed to reach the goal.

In [6]:
from collections import deque
from docHelpers_ipython import Doc

# VISUALIZATION PURPOSE
from IPython.display import display, Image

cameFrom = { }  # This has to be globally accessible for reconstructPath to 

def reconstructPath(current_node):
    Path = [current_node]
    while current_node is not None:
        current_node = cameFrom[current_node] # Now current node would become 'A'
        Path.append(current_node)
    return reversed(Path[:-1]) # trimming    

def BFS(start, goal): 

    # INITIALIZATION
    openSet = deque()
    openSet.append(start)  
    cameFrom[start] = None
    closedSet = set(start)   # GRAPH: we never get to add 'start' in closed set once in main loop, so we do now!             

    # MAIN LOOP
    while openSet: #eventual failure exit

        current_node = openSet.popleft()    # FIFO 

        if current_node is goal:   # successful exit
            print('Success. Route from {} to {} found. Path: {}'.format(start,goal,list(reconstructPath(current_node))))
            break

        # VISUALIZATION PURPOSE
        all_neighbors = M.get(current_node,[]).get('connections',[])
        considered_neighbors = list(set(all_neighbors) - set(closedSet)) # thank you: https://stackoverflow.com/questions/3462143/get-difference-between-two-lists

        for each_neighbor in M.get(current_node,[]).get('connections',[]):  # add neighbors                
            if each_neighbor not in closedSet:  # GRAPH: add to queue only if not already visited
                openSet.append(each_neighbor)                  
                cameFrom[each_neighbor] = current_node    
                closedSet.add(each_neighbor)

        # VISUALIZATION PURPOSE
        _ = doc.computeGraphs(current_node, considered_neighbors)

    # VISUALIZATION PURPOSE     
    _ = doc.computeGraphs(current_node, [])            
    return 'No Goal found!'


M = romania_location_map_CYCLED 
start = 'A'
goal = 'B'

# VISUALIZATION PURPOSE - called here caz we use doc in ourSearchAlgo
doc = Doc(M) 

result = BFS(start, goal)

# VISUALIZATION PURPOSE
images = doc.render()
display(Image(images[0]),Image(images[1]),Image(images[2]))
Success. Route from A to B found. Path: ['A', 'S', 'F', 'B']